Introduction

PH345: Winter 2025

Phil Boonstra

Course Introduction: Telling stories with data

Ex 1: Gapminder and Life Expectancy

Chapter 2 of Unwin (2024)

Gapminder identifies systematic misconceptions about important global trends and proportions and uses reliable data to develop easy to understand teaching materials to rid people of their misconceptions.

https://www.gapminder.org/about/

What do you see?

Completely flat lines in the 19th century; spikes in various places; many spikes in the 1940s; increases after 1950.

Ex 2: Mapping COVID-19

https://www.esri.com/arcgis-blog/products/product/mapping/mapping-coronavirus-responsibly/

  • Choropleth: graduated colors

  • Provinces are different sizes, different populations

  • Comparison across provinces is difficult

  • Bar chart of number of cases for each Chinese province

  • Is a map justified?

  • Blue replaces red. Less “emotive”

  • Rates replace totals

  • Hubei province rightly set apart from others

  • Dots representing 10 cases randomly placed in each province

  • Potentially misleading conclusion that Hubei province was overwhelmed

  • Totals represented by proportional circles

  • Not adjusted for population

  • All areas represented, e.g. Macau and Hong Kong

  • Log-transformed totals

  • Importance of legend

  • Logarithm de-emphasizes extremely large values but risks over-emphasizing small values

  • Inappropriate ‘smoothing’ of data based upon geographic center

  • Epicenter (Hubei) is lost

  • Suggests all of eastern China was overwhelmed

  • Choice of projection

  • Web Mercator: up is always north, but areas are increasingly inflated away from the equator, which risks misinterpreting geographic area

  • Albers Equal Area preserves geographic area but can distort shape
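Two of the choices above, rates versus totals and log-transformed totals, can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the three regions and their numbers are hypothetical and are not taken from the ESRI article, and the per-100,000 denominator is an assumed convention. It is written in Python for brevity; the same calculation is a one-liner in R.

```python
import math

# Hypothetical regions: (name, total cases, population).
# These numbers are made up for illustration only.
regions = [
    ("Region A", 50_000, 60_000_000),
    ("Region B", 1_200, 100_000_000),
    ("Region C", 10, 600_000),
]


def rate_per_100k(cases, population):
    """Cases per 100,000 residents (assumed convention for a 'rate')."""
    return cases / population * 100_000


def log10_total(cases):
    """Base-10 logarithm of the total; only defined for cases > 0."""
    return math.log10(cases)


# One row per region: name, total, rate, log10(total).
summary = [
    (name, cases, rate_per_100k(cases, pop), log10_total(cases))
    for name, cases, pop in regions
]

for name, cases, rate, log_cases in summary:
    print(f"{name}: total={cases}, rate per 100k={rate:.2f}, "
          f"log10(total)={log_cases:.2f}")
```

In this made-up example, Region C has far fewer cases than Region B but a higher rate, so a map of rates and a map of totals would rank them differently. On the log scale, Region A's total is 5,000 times Region C's but only a few units larger, which shows how the transform compresses very large values and gives small ones more visual weight.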

Ex 3: Dr. John Snow’s map

(Gilbert, 1958)

https://www.jstor.org/stable/pdf/1790244.pdf

Course Objectives

  1. To understand the principles of effective and accurate graphical representation of different data types;

  2. To draw conclusions from graphical representations about relationships and trends in variables;

  3. To understand how graphical representations of data can be used to mislead or exaggerate relationships;

  4. To create and improve data visualizations using the R statistical environment.

References

Gilbert, E.W., 1958. Pioneer maps of health and disease in England. The Geographical Journal, 124(2), pp.172-183.

Unwin, A., 2024. Getting (more out of) Graphics: Practice and Principles of Data Visualisation. CRC Press.